Skip to content

Conversation

@philnik777
Copy link
Contributor

----------------------------------------------
Benchmark                       old        new
--------------------------- ------------------
BM_tolower_char<char>       1.64 ns    1.41 ns
BM_tolower_char<wchar_t>    1.64 ns    1.41 ns
BM_tolower_string<char>     32.4 ns    12.8 ns
BM_tolower_string<wchar_t>  32.9 ns    15.1 ns
BM_toupper_char<char>       1.63 ns    1.64 ns
BM_toupper_char<wchar_t>    1.63 ns    1.41 ns
BM_toupper_string<char>     32.2 ns    12.7 ns
BM_toupper_string<wchar_t>  33.0 ns    15.1 ns

@github-actions
Copy link

github-actions bot commented Jun 23, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@philnik777 philnik777 force-pushed the optimize_toupperlower branch 4 times, most recently from c2a64ef to f80600a Compare June 25, 2025 07:20
@philnik777 philnik777 marked this pull request as ready for review June 25, 2025 14:12
@philnik777 philnik777 requested a review from a team as a code owner June 25, 2025 14:12
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Jun 25, 2025
@llvmbot
Copy link
Member

llvmbot commented Jun 25, 2025

@llvm/pr-subscribers-libcxx

Author: Nikolas Klauser (philnik777)

Changes
----------------------------------------------
Benchmark                       old        new
--------------------------- ------------------
BM_tolower_char&lt;char&gt;       1.64 ns    1.41 ns
BM_tolower_char&lt;wchar_t&gt;    1.64 ns    1.41 ns
BM_tolower_string&lt;char&gt;     32.4 ns    12.8 ns
BM_tolower_string&lt;wchar_t&gt;  32.9 ns    15.1 ns
BM_toupper_char&lt;char&gt;       1.63 ns    1.64 ns
BM_toupper_char&lt;wchar_t&gt;    1.63 ns    1.41 ns
BM_toupper_string&lt;char&gt;     32.2 ns    12.7 ns
BM_toupper_string&lt;wchar_t&gt;  33.0 ns    15.1 ns

Full diff: https://github.com/llvm/llvm-project/pull/145344.diff

11 Files Affected:

  • (modified) libcxx/include/__config (-4)
  • (modified) libcxx/include/__locale (-12)
  • (modified) libcxx/include/__locale_dir/locale_base_api.h (-7)
  • (modified) libcxx/include/__locale_dir/support/bsd_like.h (-6)
  • (modified) libcxx/include/__locale_dir/support/linux.h (-6)
  • (modified) libcxx/include/__locale_dir/support/no_locale/characters.h (-6)
  • (modified) libcxx/include/__locale_dir/support/windows.h (-6)
  • (modified) libcxx/lib/abi/x86_64-unknown-linux-gnu.libcxxabi.v1.stable.exceptions.nonew.abilist (-2)
  • (modified) libcxx/lib/abi/x86_64-unknown-linux-gnu.libcxxabi.v1.stable.noexceptions.nonew.abilist (-2)
  • (modified) libcxx/src/locale.cpp (+22-107)
  • (added) libcxx/test/benchmarks/locale/ctype.bench.cpp (+69)
diff --git a/libcxx/include/__config b/libcxx/include/__config
index af8a297fdf3fd..e5f94d31d8535 100644
--- a/libcxx/include/__config
+++ b/libcxx/include/__config
@@ -639,10 +639,6 @@ typedef __char32_t char32_t;
 #    define _LIBCPP_HAS_C11_ALIGNED_ALLOC 1
 #  endif
 
-#  if defined(__APPLE__) || defined(__FreeBSD__)
-#    define _LIBCPP_HAS_DEFAULTRUNELOCALE
-#  endif
-
 #  if defined(__APPLE__) || defined(__FreeBSD__)
 #    define _LIBCPP_WCTYPE_IS_MASK
 #  endif
diff --git a/libcxx/include/__locale b/libcxx/include/__locale
index 92e45e2531c2a..757a53951f66e 100644
--- a/libcxx/include/__locale
+++ b/libcxx/include/__locale
@@ -589,18 +589,6 @@ public:
 #  endif
   _LIBCPP_HIDE_FROM_ABI const mask* table() const _NOEXCEPT { return __tab_; }
   static const mask* classic_table() _NOEXCEPT;
-#  if defined(__GLIBC__) || defined(__EMSCRIPTEN__)
-  static const int* __classic_upper_table() _NOEXCEPT;
-  static const int* __classic_lower_table() _NOEXCEPT;
-#  endif
-#  if defined(__NetBSD__)
-  static const short* __classic_upper_table() _NOEXCEPT;
-  static const short* __classic_lower_table() _NOEXCEPT;
-#  endif
-#  if defined(__MVS__)
-  static const unsigned short* __classic_upper_table() _NOEXCEPT;
-  static const unsigned short* __classic_lower_table() _NOEXCEPT;
-#  endif
 
 protected:
   ~ctype() override;
diff --git a/libcxx/include/__locale_dir/locale_base_api.h b/libcxx/include/__locale_dir/locale_base_api.h
index bbc30b1cfe03f..8dbc28e839839 100644
--- a/libcxx/include/__locale_dir/locale_base_api.h
+++ b/libcxx/include/__locale_dir/locale_base_api.h
@@ -64,8 +64,6 @@
 // Character manipulation functions
 // --------------------------------
 // namespace __locale {
-//  int     __islower(int, __locale_t);
-//  int     __isupper(int, __locale_t);
 //  int     __isdigit(int, __locale_t);  // required by the headers
 //  int     __isxdigit(int, __locale_t); // required by the headers
 //  int     __toupper(int, __locale_t);
@@ -208,11 +206,6 @@ __strtoull(const char* __nptr, char** __endptr, int __base, __locale_t __loc) {
 //
 // Character manipulation functions
 //
-#    if defined(_LIBCPP_BUILDING_LIBRARY)
-inline _LIBCPP_HIDE_FROM_ABI int __islower(int __ch, __locale_t __loc) { return islower_l(__ch, __loc); }
-inline _LIBCPP_HIDE_FROM_ABI int __isupper(int __ch, __locale_t __loc) { return isupper_l(__ch, __loc); }
-#    endif
-
 inline _LIBCPP_HIDE_FROM_ABI int __isdigit(int __ch, __locale_t __loc) { return isdigit_l(__ch, __loc); }
 inline _LIBCPP_HIDE_FROM_ABI int __isxdigit(int __ch, __locale_t __loc) { return isxdigit_l(__ch, __loc); }
 
diff --git a/libcxx/include/__locale_dir/support/bsd_like.h b/libcxx/include/__locale_dir/support/bsd_like.h
index 54eb397358d7a..ac402924709e5 100644
--- a/libcxx/include/__locale_dir/support/bsd_like.h
+++ b/libcxx/include/__locale_dir/support/bsd_like.h
@@ -89,12 +89,6 @@ __strtoull(const char* __nptr, char** __endptr, int __base, __locale_t __loc) {
 //
 // Character manipulation functions
 //
-#if defined(_LIBCPP_BUILDING_LIBRARY)
-inline _LIBCPP_HIDE_FROM_ABI int __islower(int __c, __locale_t __loc) { return ::islower_l(__c, __loc); }
-
-inline _LIBCPP_HIDE_FROM_ABI int __isupper(int __c, __locale_t __loc) { return ::isupper_l(__c, __loc); }
-#endif
-
 inline _LIBCPP_HIDE_FROM_ABI int __isdigit(int __c, __locale_t __loc) { return ::isdigit_l(__c, __loc); }
 
 inline _LIBCPP_HIDE_FROM_ABI int __isxdigit(int __c, __locale_t __loc) { return ::isxdigit_l(__c, __loc); }
diff --git a/libcxx/include/__locale_dir/support/linux.h b/libcxx/include/__locale_dir/support/linux.h
index fa0b03c646a2a..23bcf44c31dbf 100644
--- a/libcxx/include/__locale_dir/support/linux.h
+++ b/libcxx/include/__locale_dir/support/linux.h
@@ -116,12 +116,6 @@ __strtoull(const char* __nptr, char** __endptr, int __base, __locale_t __loc) {
 //
 // Character manipulation functions
 //
-#if defined(_LIBCPP_BUILDING_LIBRARY)
-inline _LIBCPP_HIDE_FROM_ABI int __islower(int __c, __locale_t __loc) { return islower_l(__c, __loc); }
-
-inline _LIBCPP_HIDE_FROM_ABI int __isupper(int __c, __locale_t __loc) { return isupper_l(__c, __loc); }
-#endif
-
 inline _LIBCPP_HIDE_FROM_ABI int __isdigit(int __c, __locale_t __loc) { return isdigit_l(__c, __loc); }
 
 inline _LIBCPP_HIDE_FROM_ABI int __isxdigit(int __c, __locale_t __loc) { return isxdigit_l(__c, __loc); }
diff --git a/libcxx/include/__locale_dir/support/no_locale/characters.h b/libcxx/include/__locale_dir/support/no_locale/characters.h
index 4fb48ed9ceac1..1281b8bd13094 100644
--- a/libcxx/include/__locale_dir/support/no_locale/characters.h
+++ b/libcxx/include/__locale_dir/support/no_locale/characters.h
@@ -29,12 +29,6 @@ namespace __locale {
 //
 // Character manipulation functions
 //
-#if defined(_LIBCPP_BUILDING_LIBRARY)
-inline _LIBCPP_HIDE_FROM_ABI int __islower(int __c, __locale_t) { return std::islower(__c); }
-
-inline _LIBCPP_HIDE_FROM_ABI int __isupper(int __c, __locale_t) { return std::isupper(__c); }
-#endif
-
 inline _LIBCPP_HIDE_FROM_ABI int __isdigit(int __c, __locale_t) { return std::isdigit(__c); }
 
 inline _LIBCPP_HIDE_FROM_ABI int __isxdigit(int __c, __locale_t) { return std::isxdigit(__c); }
diff --git a/libcxx/include/__locale_dir/support/windows.h b/libcxx/include/__locale_dir/support/windows.h
index 0d3089c150081..0df8709f118d0 100644
--- a/libcxx/include/__locale_dir/support/windows.h
+++ b/libcxx/include/__locale_dir/support/windows.h
@@ -197,12 +197,6 @@ __strtoull(const char* __nptr, char** __endptr, int __base, __locale_t __loc) {
 //
 // Character manipulation functions
 //
-#if defined(_LIBCPP_BUILDING_LIBRARY)
-inline _LIBCPP_HIDE_FROM_ABI int __islower(int __c, __locale_t __loc) { return _islower_l(__c, __loc); }
-
-inline _LIBCPP_HIDE_FROM_ABI int __isupper(int __c, __locale_t __loc) { return _isupper_l(__c, __loc); }
-#endif
-
 inline _LIBCPP_HIDE_FROM_ABI int __isdigit(int __c, __locale_t __loc) { return _isdigit_l(__c, __loc); }
 
 inline _LIBCPP_HIDE_FROM_ABI int __isxdigit(int __c, __locale_t __loc) { return _isxdigit_l(__c, __loc); }
diff --git a/libcxx/lib/abi/x86_64-unknown-linux-gnu.libcxxabi.v1.stable.exceptions.nonew.abilist b/libcxx/lib/abi/x86_64-unknown-linux-gnu.libcxxabi.v1.stable.exceptions.nonew.abilist
index 679a0626d3268..8c55c4385f6f6 100644
--- a/libcxx/lib/abi/x86_64-unknown-linux-gnu.libcxxabi.v1.stable.exceptions.nonew.abilist
+++ b/libcxx/lib/abi/x86_64-unknown-linux-gnu.libcxxabi.v1.stable.exceptions.nonew.abilist
@@ -1318,8 +1318,6 @@
 {'is_defined': True, 'name': '_ZNSt3__15alignEmmRPvRm', 'type': 'FUNC'}
 {'is_defined': True, 'name': '_ZNSt3__15ctypeIcE10table_sizeE', 'size': 8, 'type': 'OBJECT'}
 {'is_defined': True, 'name': '_ZNSt3__15ctypeIcE13classic_tableEv', 'type': 'FUNC'}
-{'is_defined': True, 'name': '_ZNSt3__15ctypeIcE21__classic_lower_tableEv', 'type': 'FUNC'}
-{'is_defined': True, 'name': '_ZNSt3__15ctypeIcE21__classic_upper_tableEv', 'type': 'FUNC'}
 {'is_defined': True, 'name': '_ZNSt3__15ctypeIcE2idE', 'size': 16, 'type': 'OBJECT'}
 {'is_defined': True, 'name': '_ZNSt3__15ctypeIcEC1EPKtbm', 'type': 'FUNC'}
 {'is_defined': True, 'name': '_ZNSt3__15ctypeIcEC2EPKtbm', 'type': 'FUNC'}
diff --git a/libcxx/lib/abi/x86_64-unknown-linux-gnu.libcxxabi.v1.stable.noexceptions.nonew.abilist b/libcxx/lib/abi/x86_64-unknown-linux-gnu.libcxxabi.v1.stable.noexceptions.nonew.abilist
index 02bec0e7cbefb..51caa07a74330 100644
--- a/libcxx/lib/abi/x86_64-unknown-linux-gnu.libcxxabi.v1.stable.noexceptions.nonew.abilist
+++ b/libcxx/lib/abi/x86_64-unknown-linux-gnu.libcxxabi.v1.stable.noexceptions.nonew.abilist
@@ -1289,8 +1289,6 @@
 {'is_defined': True, 'name': '_ZNSt3__15alignEmmRPvRm', 'type': 'FUNC'}
 {'is_defined': True, 'name': '_ZNSt3__15ctypeIcE10table_sizeE', 'size': 8, 'type': 'OBJECT'}
 {'is_defined': True, 'name': '_ZNSt3__15ctypeIcE13classic_tableEv', 'type': 'FUNC'}
-{'is_defined': True, 'name': '_ZNSt3__15ctypeIcE21__classic_lower_tableEv', 'type': 'FUNC'}
-{'is_defined': True, 'name': '_ZNSt3__15ctypeIcE21__classic_upper_tableEv', 'type': 'FUNC'}
 {'is_defined': True, 'name': '_ZNSt3__15ctypeIcE2idE', 'size': 16, 'type': 'OBJECT'}
 {'is_defined': True, 'name': '_ZNSt3__15ctypeIcEC1EPKtbm', 'type': 'FUNC'}
 {'is_defined': True, 'name': '_ZNSt3__15ctypeIcEC2EPKtbm', 'type': 'FUNC'}
diff --git a/libcxx/src/locale.cpp b/libcxx/src/locale.cpp
index 30a7a54e1c016..da735865c322c 100644
--- a/libcxx/src/locale.cpp
+++ b/libcxx/src/locale.cpp
@@ -697,6 +697,20 @@ const ctype_base::mask ctype_base::graph;
 
 // template <> class ctype<wchar_t>;
 
+template <class CharT>
+static CharT to_upper_impl(CharT c) {
+  if (c < 'a' || c > 'z')
+    return c;
+  return c & ~0x20;
+}
+
+template <class CharT>
+static CharT to_lower_impl(CharT c) {
+  if (c < 'A' || c > 'Z')
+    return c;
+  return c | 0x20;
+}
+
 #if _LIBCPP_HAS_WIDE_CHARACTERS
 constinit locale::id ctype<wchar_t>::id;
 
@@ -726,48 +740,19 @@ const wchar_t* ctype<wchar_t>::do_scan_not(mask m, const char_type* low, const c
   return low;
 }
 
-wchar_t ctype<wchar_t>::do_toupper(char_type c) const {
-#  ifdef _LIBCPP_HAS_DEFAULTRUNELOCALE
-  return std::__libcpp_isascii(c) ? _DefaultRuneLocale.__mapupper[c] : c;
-#  elif defined(__GLIBC__) || defined(__EMSCRIPTEN__) || defined(__NetBSD__) || defined(__MVS__)
-  return std::__libcpp_isascii(c) ? ctype<char>::__classic_upper_table()[c] : c;
-#  else
-  return (std::__libcpp_isascii(c) && __locale::__iswlower(c, _LIBCPP_GET_C_LOCALE)) ? c - L'a' + L'A' : c;
-#  endif
-}
+wchar_t ctype<wchar_t>::do_toupper(char_type c) const { return to_upper_impl(c); }
 
 const wchar_t* ctype<wchar_t>::do_toupper(char_type* low, const char_type* high) const {
   for (; low != high; ++low)
-#  ifdef _LIBCPP_HAS_DEFAULTRUNELOCALE
-    *low = std::__libcpp_isascii(*low) ? _DefaultRuneLocale.__mapupper[*low] : *low;
-#  elif defined(__GLIBC__) || defined(__EMSCRIPTEN__) || defined(__NetBSD__) || defined(__MVS__)
-    *low = std::__libcpp_isascii(*low) ? ctype<char>::__classic_upper_table()[*low] : *low;
-#  else
-    *low =
-        (std::__libcpp_isascii(*low) && __locale::__islower(*low, _LIBCPP_GET_C_LOCALE)) ? (*low - L'a' + L'A') : *low;
-#  endif
+    *low = to_upper_impl(*low);
   return low;
 }
 
-wchar_t ctype<wchar_t>::do_tolower(char_type c) const {
-#  ifdef _LIBCPP_HAS_DEFAULTRUNELOCALE
-  return std::__libcpp_isascii(c) ? _DefaultRuneLocale.__maplower[c] : c;
-#  elif defined(__GLIBC__) || defined(__EMSCRIPTEN__) || defined(__NetBSD__) || defined(__MVS__)
-  return std::__libcpp_isascii(c) ? ctype<char>::__classic_lower_table()[c] : c;
-#  else
-  return (std::__libcpp_isascii(c) && __locale::__isupper(c, _LIBCPP_GET_C_LOCALE)) ? c - L'A' + 'a' : c;
-#  endif
-}
+wchar_t ctype<wchar_t>::do_tolower(char_type c) const { return to_lower_impl(c); }
 
 const wchar_t* ctype<wchar_t>::do_tolower(char_type* low, const char_type* high) const {
   for (; low != high; ++low)
-#  ifdef _LIBCPP_HAS_DEFAULTRUNELOCALE
-    *low = std::__libcpp_isascii(*low) ? _DefaultRuneLocale.__maplower[*low] : *low;
-#  elif defined(__GLIBC__) || defined(__EMSCRIPTEN__) || defined(__NetBSD__) || defined(__MVS__)
-    *low = std::__libcpp_isascii(*low) ? ctype<char>::__classic_lower_table()[*low] : *low;
-#  else
-    *low = (std::__libcpp_isascii(*low) && __locale::__isupper(*low, _LIBCPP_GET_C_LOCALE)) ? *low - L'A' + L'a' : *low;
-#  endif
+    *low = to_lower_impl(*low);
   return low;
 }
 
@@ -811,59 +796,19 @@ ctype<char>::~ctype() {
     delete[] __tab_;
 }
 
-char ctype<char>::do_toupper(char_type c) const {
-#ifdef _LIBCPP_HAS_DEFAULTRUNELOCALE
-  return std::__libcpp_isascii(c) ? static_cast<char>(_DefaultRuneLocale.__mapupper[static_cast<ptrdiff_t>(c)]) : c;
-#elif defined(__NetBSD__)
-  return static_cast<char>(__classic_upper_table()[static_cast<unsigned char>(c)]);
-#elif defined(__GLIBC__) || defined(__EMSCRIPTEN__) || defined(__MVS__)
-  return std::__libcpp_isascii(c) ? static_cast<char>(__classic_upper_table()[static_cast<unsigned char>(c)]) : c;
-#else
-  return (std::__libcpp_isascii(c) && __locale::__islower(c, _LIBCPP_GET_C_LOCALE)) ? c - 'a' + 'A' : c;
-#endif
-}
+char ctype<char>::do_toupper(char_type c) const { return to_upper_impl(c); }
 
 const char* ctype<char>::do_toupper(char_type* low, const char_type* high) const {
   for (; low != high; ++low)
-#ifdef _LIBCPP_HAS_DEFAULTRUNELOCALE
-    *low = std::__libcpp_isascii(*low)
-             ? static_cast<char>(_DefaultRuneLocale.__mapupper[static_cast<ptrdiff_t>(*low)])
-             : *low;
-#elif defined(__NetBSD__)
-    *low = static_cast<char>(__classic_upper_table()[static_cast<unsigned char>(*low)]);
-#elif defined(__GLIBC__) || defined(__EMSCRIPTEN__) || defined(__MVS__)
-    *low = std::__libcpp_isascii(*low) ? static_cast<char>(__classic_upper_table()[static_cast<size_t>(*low)]) : *low;
-#else
-    *low = (std::__libcpp_isascii(*low) && __locale::__islower(*low, _LIBCPP_GET_C_LOCALE)) ? *low - 'a' + 'A' : *low;
-#endif
+    *low = to_upper_impl(*low);
   return low;
 }
 
-char ctype<char>::do_tolower(char_type c) const {
-#ifdef _LIBCPP_HAS_DEFAULTRUNELOCALE
-  return std::__libcpp_isascii(c) ? static_cast<char>(_DefaultRuneLocale.__maplower[static_cast<ptrdiff_t>(c)]) : c;
-#elif defined(__NetBSD__)
-  return static_cast<char>(__classic_lower_table()[static_cast<unsigned char>(c)]);
-#elif defined(__GLIBC__) || defined(__EMSCRIPTEN__) || defined(__MVS__)
-  return std::__libcpp_isascii(c) ? static_cast<char>(__classic_lower_table()[static_cast<size_t>(c)]) : c;
-#else
-  return (std::__libcpp_isascii(c) && __locale::__isupper(c, _LIBCPP_GET_C_LOCALE)) ? c - 'A' + 'a' : c;
-#endif
-}
+char ctype<char>::do_tolower(char_type c) const { return to_lower_impl(c); }
 
 const char* ctype<char>::do_tolower(char_type* low, const char_type* high) const {
   for (; low != high; ++low)
-#ifdef _LIBCPP_HAS_DEFAULTRUNELOCALE
-    *low = std::__libcpp_isascii(*low)
-             ? static_cast<char>(_DefaultRuneLocale.__maplower[static_cast<ptrdiff_t>(*low)])
-             : *low;
-#elif defined(__NetBSD__)
-    *low = static_cast<char>(__classic_lower_table()[static_cast<unsigned char>(*low)]);
-#elif defined(__GLIBC__) || defined(__EMSCRIPTEN__) || defined(__MVS__)
-    *low = std::__libcpp_isascii(*low) ? static_cast<char>(__classic_lower_table()[static_cast<size_t>(*low)]) : *low;
-#else
-    *low = (std::__libcpp_isascii(*low) && __locale::__isupper(*low, _LIBCPP_GET_C_LOCALE)) ? *low - 'A' + 'a' : *low;
-#endif
+    *low = to_lower_impl(*low);
   return low;
 }
 
@@ -1010,36 +955,6 @@ const ctype<char>::mask* ctype<char>::classic_table() noexcept {
 }
 #endif
 
-#if defined(__GLIBC__)
-const int* ctype<char>::__classic_lower_table() noexcept { return _LIBCPP_GET_C_LOCALE->__ctype_tolower; }
-
-const int* ctype<char>::__classic_upper_table() noexcept { return _LIBCPP_GET_C_LOCALE->__ctype_toupper; }
-#elif defined(__NetBSD__)
-const short* ctype<char>::__classic_lower_table() noexcept { return _C_tolower_tab_ + 1; }
-
-const short* ctype<char>::__classic_upper_table() noexcept { return _C_toupper_tab_ + 1; }
-
-#elif defined(__EMSCRIPTEN__)
-const int* ctype<char>::__classic_lower_table() noexcept { return *__ctype_tolower_loc(); }
-
-const int* ctype<char>::__classic_upper_table() noexcept { return *__ctype_toupper_loc(); }
-#elif defined(__MVS__)
-const unsigned short* ctype<char>::__classic_lower_table() _NOEXCEPT {
-#  if defined(__NATIVE_ASCII_F)
-  return const_cast<const unsigned short*>(__OBJ_DATA(__lc_ctype_a)->lower);
-#  else
-  return const_cast<const unsigned short*>(__ctype + __TOLOWER_INDEX);
-#  endif
-}
-const unsigned short* ctype<char>::__classic_upper_table() _NOEXCEPT {
-#  if defined(__NATIVE_ASCII_F)
-  return const_cast<const unsigned short*>(__OBJ_DATA(__lc_ctype_a)->upper);
-#  else
-  return const_cast<const unsigned short*>(__ctype + __TOUPPER_INDEX);
-#  endif
-}
-#endif // __GLIBC__ || __NETBSD__ || __EMSCRIPTEN__ || __MVS__
-
 // template <> class ctype_byname<char>
 
 ctype_byname<char>::ctype_byname(const char* name, size_t refs)
diff --git a/libcxx/test/benchmarks/locale/ctype.bench.cpp b/libcxx/test/benchmarks/locale/ctype.bench.cpp
new file mode 100644
index 0000000000000..92afeb2ca35f3
--- /dev/null
+++ b/libcxx/test/benchmarks/locale/ctype.bench.cpp
@@ -0,0 +1,69 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+// UNSUPPORTED: c++03, c++11, c++14
+
+#include <locale>
+
+#include <benchmark/benchmark.h>
+
+#include "make_string.h"
+
+template <class CharT>
+static void BM_tolower_char(benchmark::State& state) {
+  const auto& ct = std::use_facet<std::ctype<CharT>>(std::locale::classic());
+
+  for (auto _ : state) {
+    benchmark::DoNotOptimize(ct.tolower(CharT('c')));
+  }
+}
+
+BENCHMARK(BM_tolower_char<char>);
+BENCHMARK(BM_tolower_char<wchar_t>);
+
+template <class CharT>
+static void BM_tolower_string(benchmark::State& state) {
+  const auto& ct = std::use_facet<std::ctype<CharT>>(std::locale::classic());
+  std::basic_string<CharT> str;
+
+  for (auto _ : state) {
+    str = MAKE_STRING_VIEW(CharT, "THIS IS A LONG STRING TO MAKE TO LOWER");
+    benchmark::DoNotOptimize(ct.tolower(str.data(), str.data() + str.size()));
+  }
+}
+
+BENCHMARK(BM_tolower_string<char>);
+BENCHMARK(BM_tolower_string<wchar_t>);
+
+template <class CharT>
+static void BM_toupper_char(benchmark::State& state) {
+  const auto& ct = std::use_facet<std::ctype<CharT>>(std::locale::classic());
+
+  for (auto _ : state) {
+    benchmark::DoNotOptimize(ct.toupper(CharT('c')));
+  }
+}
+
+BENCHMARK(BM_toupper_char<char>);
+BENCHMARK(BM_toupper_char<wchar_t>);
+
+template <class CharT>
+static void BM_toupper_string(benchmark::State& state) {
+  const auto& ct = std::use_facet<std::ctype<CharT>>(std::locale::classic());
+  std::basic_string<CharT> str;
+
+  for (auto _ : state) {
+    str = MAKE_STRING_VIEW(CharT, "THIS IS A LONG STRING TO MAKE TO LOWER");
+    benchmark::DoNotOptimize(ct.toupper(str.data(), str.data() + str.size()));
+  }
+}
+
+BENCHMARK(BM_toupper_string<char>);
+BENCHMARK(BM_toupper_string<wchar_t>);
+
+BENCHMARK_MAIN();

@philnik777 philnik777 force-pushed the optimize_toupperlower branch from f80600a to c2f9e66 Compare July 4, 2025 09:16
@philnik777 philnik777 force-pushed the optimize_toupperlower branch from c2f9e66 to ec77924 Compare July 9, 2025 09:25
@philnik777 philnik777 merged commit 44d3769 into llvm:main Jul 9, 2025
91 of 95 checks passed
@philnik777 philnik777 deleted the optimize_toupperlower branch July 9, 2025 14:32

* [libc++] Optimize ctype::to{lower,upper}

This patch removed __classic_upper_table() and __classic_lower_table(), which were only ever accessed from the dylib.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing those function breaks ABI. The z/OS intends to keep them around. Wonder why this was not discussed among different vendors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These functions were never used outside the dylib, so no program can ever reference them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see these were used inside locale.cpp only so they should never be externalized. In theory once the symbol is exported nothing stops of being called externally though I would think this is unlikely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly. This is a libc++-internal function. We've never used it ourselves in the headers, so it's your own fault if you call them. It's even more unlikely anybody even tried to use them, since it's not available on most platforms either.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear we never called this function except the internal code which was removed in this PR. After reviewing this internally we decided to remove those symbols. I appreciate your time and the clarity you've provided.

Comment on lines +700 to +713
template <class CharT>
static CharT to_upper_impl(CharT c) {
if (c < 'a' || c > 'z')
return c;
return c & ~0x20;
}

template <class CharT>
static CharT to_lower_impl(CharT c) {
if (c < 'A' || c > 'Z')
return c;
return c | 0x20;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code does not work for non-ASCII encodings since the assumption of char ordering cannot be assumed. Is there any reason why we cannot simply call C functions ::tolower(c) and touppoer(c) here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming system call to :tolower(c) and touppoer(c) are macro expanded and/or get inlined the performance should be equivalent to above implementation, isn't it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need the classic C locale, not the currently installed locale.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants